Compensation data prediction

ABSTRACT

A method of predicting compensation data includes obtaining compensation data, associated with a job category, with at least one datum being associated with each of a plurality of characteristics associated with the job category, determining values of factors, associated with respective ones of the characteristics, and a base value that when used as operands of a function yield estimates of the obtained data such that relationships between the estimates and corresponding obtained compensation data satisfy at least one criterion, and using a portion of the values of factors and the base value by a computer to automatically obtain estimates of compensation data.

FIELD OF THE INVENTION

The invention relates to providing compensation data and more particularly to processing compensation data to determine and provide predicted compensation data.

BACKGROUND OF THE INVENTION

Employers, as well as persons seeking jobs, typically want to know how much compensation is appropriate and/or how much others in similar situations are paid. The compensation may take many forms such as base pay, bonus pay, stock, stock options, incentives, benefits (e.g., medical, dental, optical, etc.), and perquisites. How much and in what form a person will be compensated is typically a driving influence in whether the person is interested in obtaining and continuing in a particular job. Thus, compensation information is often desired by individual persons seeking jobs and entities offering jobs.

Data regarding compensation from which a person can accurately assess the compensation provided for a particular job is often not available. For example, while compensation data reporting companies exist, employers may not provide compensation data to these companies at all, or not for a particular job of interest to an employer or a job seeker. Further, data may not be available for positions in a particular region that the seeker is interested in, or for the size of organization, the particular job, or the industry sought by the job seeker. Also, while compensation data may be available, there may be so few matching or relevant data points that the user is unable to determine what compensation should be expected.

SUMMARY OF THE INVENTION

In general, in an aspect, the invention provides a method of predicting compensation data, the method including obtaining compensation data, associated with a job category, with at least one datum being associated with each of a plurality of characteristics associated with the job category, determining values of factors, associated with respective ones of the characteristics, and a base value that when used as operands of a function yield estimates of the obtained data such that relationships between the estimates and corresponding obtained compensation data satisfy at least one criterion, and using a portion of the values of factors and the base value by a computer to automatically obtain estimates of compensation data.

Implementations of the invention may include one or more of the following features. Using the portion of the values of factors and the base value includes using each combination of values of factors for which values are determined. The method further includes deriving reference data using the obtained compensation data and the values of factors, and aggregating the reference data to determine the base value. The aggregating comprises averaging the reference data. The method further includes comparing estimated compensation data with obtained compensation data of the same job and having the same associated characteristics as the estimated compensation data, adjusting the values of the factors as appropriate depending upon results of comparing the estimated and obtained compensation data, and repeating the deriving, aggregating, using, comparing, and adjusting until comparing the estimated and obtained compensation data satisfy the at least one criterion.

Implementations of the invention may also include one or more of the following features. The characteristics are scope criteria associated with jobs. The characteristics include at least one of geographic region, size of organization, industry, and seniority. The method further includes comparing indicia associated with obtained data and estimated data respectively. The indicia are of estimated and obtained data. The relationships are differences between estimated and obtained data associated with the same job and same characteristics. The at least one criterion is that the differences are within a magnitude limit. The relationships are ratios between estimated and obtained data associated with the same job and same characteristics. The at least one criterion is that the ratios are within a magnitude limit.

Implementations of the invention may also include one or more of the following features. The obtained data are associated with at least two job categories, and the method further includes determining a job-to-job factor relating compensation data of a first job category to compensation of a second job category, and applying the job-to-job factor to a selected datum of the first job category to determine a datum of the second job category. The selected and determined data are compensation data. The selected and determined data are values of factors. The method further includes combining the estimates of compensation data and the obtained compensation data. The combining includes weighting at least one of the estimates of compensation data and the obtained compensation data. The method further includes transmitting indicia of the estimates of compensation data via a communications network to a destination for display at the destination.

In general, in another aspect, the invention provides a system for estimating compensation data, the system including a communications network interface configured to be coupled to a communications network, a storage device configured to store compensation data, and a processor coupled to the network interface and to the storage device and configured to: calculate a base value associated with the stored compensation data by solving a relationship for reference data using the stored compensation data and predetermined values of training factors associated with scope criteria of the compensation data, the relationship relating the reference data, compensation data, and the values of the training factors, the processor further configured to calculate the base value by combining the reference data to determine the base value; and determine estimated compensation data by solving the relationship for the compensation data using values of the training factors and the base value.

Implementations of the invention may include one or more of the following features. The processor is further configured to compare the estimated compensation data and the stored compensation data and to provide indicia of the comparison. The processor is further configured to re-calculate the base value and re-determine estimated compensation data using different values of factors if relationships between the estimated compensation data and the stored compensation data fail to satisfy at least one predetermined criterion. The relationships are differences and the at least one predetermined criterion is whether the differences are within a threshold value. The relationships are ratios and the at least one predetermined criterion is whether the ratios are within a threshold value.

Implementations of the invention may also include one or more of the following features. The processor is further configured to determine initial values of the factors by combining compensation data points associated with a particular job to determine a neutral data point and comparing the neutral data point with data points associated with the scope criteria. The system further includes a user interface coupled to the processor and configured to provide values of the factors. The processor is further configured to combine the estimated and stored compensation data. The processor is further configured to provide the combined compensation data to the network interface. The processor is configured to receive indicia of weighting of the stored and estimated compensation data and to provide weighted combinations of the stored and estimated compensation data to the network interface for display on a display device coupled to the communications network.

In general, in another aspect, the invention provides a method of predicting compensation data, the method including collecting compensation data for each of a plurality of jobs, at least one datum being associated with each of a respective plurality of associated scope criteria for each job, deriving reference data for each of the plurality of jobs in accordance with a function relating collected compensation data, the reference data, and respective training factors indicative of the respective associated scope criteria, aggregating the reference data associated with each of a plurality of jobs to determine respective aggregated reference data, using a portion of the respective training factors and the respective aggregated reference data to determine respective estimated compensation data for each of the plurality of jobs, comparing the respective estimated compensation data with collected compensation data of the same job and having the same associated training factors, iterating the respective training factors as appropriate, and repeating the deriving, aggregating, using, comparing, and iterating until the compared estimated and collected compensation data satisfy at least one comparison criterion.

Implementations of the invention may include one or more of the following features. The using determines estimated compensation data for all combinations of associated training factors for each of the jobs. The method further includes determining at least one job-to-job training factor associating compensation data of a first job to compensation data of a second job, the method further comprising estimating compensation data of the first job for a combination of training factors using estimated compensation data of the second job with the same combination of training factors.

Various aspects of the invention may provide one or more of the following advantages. Compensation data may be estimated for jobs. Estimated compensation data for a job may be derived from actual data for one or more other jobs having one or more characteristics that make the actual compensation data relevant to the job whose compensation data are estimated. Answers to compensation-data-related questions may be provided in the absence of actual data providing the answers. Compensation data may be provided to inquirers indicating on how much actual data the provided data are based. Actual and estimated compensation data may be combined to provide hybrid compensation data. Compensation data can be estimated in a methodological and consistent manner. Compensation data estimates can be done in a consistent manner and can be determined and provided to users with quick turnaround times from data collection. Anomalies in surveys and other market compensation reports can be mitigated in a consistent manner so that collected and estimated data are individually and collectively consistent and reasonable.

These and other advantages of the invention, along with the invention itself, will be more fully understood after a review of the following figures, detailed description, and claims.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a simplified diagram of a compensation system.

FIG. 2 is a simplified block diagram of a computer of a service provider shown in FIG. 1.

FIG. 3 is a block flow diagram of a process of collecting, training, estimating, combining, and presenting compensation data.

FIG. 4 is a block flow diagram of collecting compensation data.

FIG. 5 is a block flow diagram of training compensation data.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Exemplary embodiments of the invention provide techniques to collect, store, analyze, predict, project, and report compensation data. Compensation data can include base pay, bonuses, cash incentives, commissions, stock, stock options, benefits, perquisites, etc. Exemplary embodiments of the invention can collect compensation data from a variety of sources. The compensation data can be segmented, e.g., based upon various scope criteria. Values of factors associated with the scope criteria can be derived from the collected data and adjusted to help predict compensation data. The factor values can be used to predict other compensation data for which no data were collected, and/or for which compensation data were collected. If compensation data were collected, the collected compensation data can be combined with estimated data meeting the same or similar scope criteria. Compensation data, be it predicted, collected, or a combination of these, can be presented to a user for analysis. Other embodiments are within the scope of the invention.

Referring to FIG. 1, a system 10 includes a compensation data service provider 12, employer/compensation data providers 14, 15, a compensation-data seeker 16, and a communication network 18, here shown as the Internet as an illustrative, but not limiting, example. The service provider 12, seeker 16, and data provider 14 are configured and coupled to communicate with each other via the network 18. The other compensation-data providers 15 are configured to provide data to the service provider 12 through other means, such as by mail.

The employer 14 and the other compensation-data providers 15 are configured to provide compensation data to the service provider 12. The employer/compensation data provider 14 can provide compensation data to the service provider 12 electronically via the network 18. For example, in response to inquiries (e.g., by the service provider 12), the employer 14 may provide data from a computer via the network 18 as shown. While only one employer 14 is shown, more employers 14 may be set up to provide compensation data regarding their employees, or other persons. Further, the other compensation-data providers 15 (that may also be employers) can respond to inquiries to provide compensation data to the service provider 12 in various forms, such as surveys 20, 22, reports 24, etc. The surveys 20, 22 may be in a form to be entered into storage at the service provider 12 manually, such as the survey 20, or may be in a machine-readable form such as a punchcard or a standard fill-in-the-bubble form 22 as shown that can be scanned.

Referring also to FIG. 2, the service provider 12 comprises a computer 13 including a processor 30, memory 32, storage devices/media 34, a display 36, an interface 38, a keyboard 40, and a mouse 42 all coupled together with a bus 44. The storage devices/media 34 may include one or more databases for storing large amounts of compensation data. Databases may also be disposed separately from the computer 13 and accessible via the interface 38. The memory 32 may include Read-Only Memory (ROM), and/or Random-Access Memory (RAM). The storage devices/media 34 include hard disk drives, floppy disk drives, CD-ROM drives, DVD drives, and the like. The display 36 is configured to provide visual indicia of data entered into the computer 13, or processed by the processor 30. The keyboard 40 and the mouse 42 are configured for data entry and manipulation. Other data entry and/or data manipulation devices may be included. The interface 38 is configured to transfer data to and from the computer 13 and the network 18, and/or to and from any other desired device, that may contain a database, that is properly connected to the interface 38 (e.g., through a Local Area Network (LAN)). Other configurations of the computer 13 are possible, e.g., without the mouse 42, and/or including a touch-sensitive cursor control, etc.

The computer 13 can execute one or more software programs to process data in accordance with features described below. In particular, the storage devices/media 34 contain appropriate computer-readable and computer-executable software code instructions that can be read and executed by the processor 30 to perform below-described functions on data.

Data Collection

The service provider 12 is configured to collect and store compensation data from multiple sources. Compensation data may be individual employee compensation data or aggregated data, and may be actual data regarding actual compensation and/or approximations of actual compensation data. Compensation data may be transmitted to the service provider 12 via the network 18 from the employer/compensation data provider(s) 14, e.g., via a web-based tool. Data may also be transmitted to the service provider 12 from the data provider 14 and/or the other compensation-data providers 15 in other manners, such as via a software upload, a direct electronic feed, or other means of collecting data and/or loading data into a database. Compensation data may also be provided via hard-copy materials such as the form 20 and/or the report 24 and may be entered manually at the service provider 12, e.g., using data input apparatus 40, 42 of the computer 13. Further, data may be provided in machine-readable hard-copy form such as the form 22 and machine read at the service provider. For example, the form 22 can be scanned and information provided to the computer 13 for storage. Received data may be stored in the storage devices/media 34 of the computer 13 for further processing.

Loaded compensation data are manipulated and stored in a data library for further processing. Loaded data are tested and validated against known data points, such as market facts, manually and/or by automated procedures, e.g., for internal consistency, and to help ensure accuracy and consistency of the loaded data. The loaded data are further converted into a standardized format (e.g., mapping data categorized by a data supplier to standard categories for analysis, such as mapping “salary” to “base pay”). This conversion may involve altering provided numbers, e.g., by changing collected data to reduce/eliminate illogical and/or nonsensical data (e.g., if collected average base pay exceeds collected average base plus bonus). The service provider 12 can store data (e.g., the data that are, as appropriate, loaded, tested, validated, and converted) into a data library for further processing such as comparison, analysis, aggregating, and reporting as described below.
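
The conversion and screening described above can be sketched as follows. This is only an illustration under stated assumptions: the mapping table, the field names, and the helper names standardize and is_consistent are hypothetical and are not taken from the described system.

```python
# Hypothetical supplier-to-standard category mapping (illustrative only).
CATEGORY_MAP = {
    "salary": "base pay",
    "base salary": "base pay",
    "total cash": "base plus bonus",
}

def standardize(record):
    """Rename supplier categories to standard categories, e.g. "salary" to "base pay"."""
    out = {}
    for name, value in record.items():
        key = name.strip().lower()
        out[CATEGORY_MAP.get(key, key)] = value
    return out

def is_consistent(record):
    """Flag the illogical case where base pay exceeds base plus bonus."""
    base = record.get("base pay")
    total = record.get("base plus bonus")
    return base is None or total is None or base <= total

loaded = standardize({"Salary": 52000.0, "Total Cash": 50000.0})
needs_review = not is_consistent(loaded)  # base pay exceeds base plus bonus
```

A record flagged this way could be routed to human review or adjusted before storage; which of these is done is left open here.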

Data Training

The service provider 12 is further configured to perform data training on collected compensation data. Data training involves, among other things, iteratively adjusting training factors to use known data to derive, estimate, and/or predict unknown data. The training factors are preferably market-based factors associated with scopes of known data for various job categories. For example, known pay figures for various jobs may have associated scopes of industry, seniority, geography, organization size, etc. Thus, for each job category there may be a set of training factors for the industry the job is in, what country/state/county/town/municipality or other geographic region collected data are associated with, how big the organization is that supplied a datum (e.g., small, medium, large, or other categories of size), what the seniority of the job is (e.g., junior, mid-level, or senior, or Accountant I vs. Accountant II), etc. Thus, the factors may have limited values indicative of classifications or ranges of the associated scope criterion that may be associated with definitions (e.g., what constitutes a small vs. a large organization, a junior vs. a mid-level person, an Accountant I vs. an Accountant II, etc.). The values may relate to other classifications, such as what geographic region the data represent (this may be unlimited based on the level of detail desired). Any criterion may have a factor with a limited or an unlimited number of values. The training factors represent impacts of various scopes on compensation. The training factors may be dependent upon one or more other training factors. Thus, for example, the training factor for geographic region may be different depending upon whether the job is in, e.g., a large vs. a small organization. Training factors may, however, be the same for different jobs; for example, the geographic region factor may be the same for engineers and for teachers. Also, relationships between compensation indicia for different jobs may be determined to provide job-to-job extrapolation training factors.

The service provider 12 is configured to determine the training factors for data training. The factors can be determined using known collected data and a training function that relates reference data and the training factors to the collected data. The training function may take many forms, such as a linear, polynomial, or multiplicative mathematical function, etc. Preferably, the training function generally takes the form:

collected data = TF₁*f(RD) [rel] TF₂*g(RD) [rel] TF₃*h(RD)  (1)

where TFₙ are the training factors, RD indicates the reference data, f, g, and h represent functions, and [rel] indicates a relationship between the terms, e.g., addition or multiplication. For example, the functions f, g, and h, etc. may be f(x)=g(x)=h(x) . . . =x and [rel] may be addition for a linear relationship such that equation (1) becomes collected data=TF₁*RD+TF₂*RD+TF₃*RD . . . . Alternatively, f(x)=1, g(x)=x, h(x)=x², etc. and [rel] is addition for a polynomial relationship such that equation (1) becomes collected data=TF₁+TF₂*RD+TF₃*RD² . . . . Further, f(x)=x, g(x)=1, h(x)=1, etc. and [rel]=multiplication may be used such that equation (1) is a multiplicative function, becoming:

collected data = RD*ΠTFₙ  (2)

where n=1, 2, 3, . . . up to the number of training factors used.
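
As an illustration only, the three example forms of equation (1) can be written as short functions. The function names and the numeric values below are assumptions chosen for readability; they are not part of the described system.

```python
from math import prod

def linear_form(factors, rd):
    # f(x) = g(x) = h(x) = x with [rel] = addition:
    # TF1*RD + TF2*RD + TF3*RD ...
    return sum(tf * rd for tf in factors)

def polynomial_form(factors, rd):
    # f(x) = 1, g(x) = x, h(x) = x**2 with [rel] = addition:
    # TF1 + TF2*RD + TF3*RD**2 ...
    return sum(tf * rd ** k for k, tf in enumerate(factors))

def multiplicative_form(factors, rd):
    # Equation (2): RD times the product of all training factors.
    return rd * prod(factors)

factors = [1.5, 0.5, 2.0]  # illustrative training factor values
rd = 10.0                  # illustrative reference datum

linear_value = linear_form(factors, rd)                  # 15 + 5 + 20
polynomial_value = polynomial_form(factors, rd)          # 1.5 + 5 + 200
multiplicative_value = multiplicative_form(factors, rd)  # 10 * 1.5
```

With the multiplicative form, a factor of 1 leaves the reference datum unchanged, which is one reason an initial value of 1 is a natural neutral starting point.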

The service provider 12 is configured to initially set the training factor values. These initial values may be arbitrary (e.g., set to 1), and/or influenced by some or all of the collected data as determined by the computer 13 and/or human input/judgment (preferably a person knowledgeable regarding compensation practices). These influences may be determined, e.g., by averaging collected data and comparing the average with individual data associated with a particular training factor, e.g., geographic area, to determine the initial training factor for that geographic area. The average may be determined from all the collected data, but is preferably taken from a portion of the collected data that is associated with a particular job (although the portion could be selected in other ways, e.g., randomly). For example, all collected pay data for the job category of “teacher” may be averaged, and a pay datum associated with a “Boston teacher” divided by the determined average to yield the initial geographic training factor for “Boston.” The same can be done using the average and a datum for an electrical engineer to determine the initial training factor for electrical engineers, etc.
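
The “Boston teacher” example can be sketched as below. The function name initial_factor and the pay figures are hypothetical. Where several data points share the scope value, this sketch averages them before dividing, which is an assumption; the text only describes dividing a single datum by the job average.

```python
from statistics import mean

def initial_factor(job_data, scope_value):
    """job_data: list of (scope_value, pay) pairs for one job category."""
    overall = mean(pay for _, pay in job_data)
    matching = [pay for scope, pay in job_data if scope == scope_value]
    if not matching:
        return 1.0  # no data for this scope: fall back to the arbitrary value 1
    # Ratio of the scope-specific pay to the average for the job category.
    return mean(matching) / overall

# Made-up teacher pay data, one point per city.
teacher_pay = [("Boston", 60000.0), ("Dallas", 50000.0), ("Omaha", 40000.0)]
boston_factor = initial_factor(teacher_pay, "Boston")  # 60000 / 50000
```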

The service provider 12 is configured to test the initial values of the training factors, and adjust and test the training factors iteratively as appropriate to determine refined training factors. The service provider 12 can use equation (1), the initial training factors, and the actual collected data to determine the reference data corresponding to the collected data. The service provider 12 can aggregate, e.g., average, the determined reference data to determine aggregated reference data and use the aggregated reference data and the initial training factors in equation (1) to determine estimated compensation data. The estimated compensation data can be compared with the actual collected data and the initial training factors can be adjusted based on these comparisons. Adjustments can be made automatically, e.g., by the computer 13, and/or by human input. The human input is preferably by a person with knowledge and experience regarding compensation practices/standards. The service provider 12 can repeat this computation/adjustment cycle to help the estimated and actual data converge, e.g., satisfy at least one predetermined criterion such as the estimated data being within acceptable tolerance ranges of the actual collected data (e.g., a result of subtracting and/or dividing estimated and actual data is/are within specified magnitudes). Data that appear to be anomalies may be disregarded and/or diminished and/or mitigated, etc. to help the estimated and actual data converge. The training factors when the iteration is terminated are the refined training factors. Job-to-job extrapolation training factors may be determined, e.g., by comparing computed/collected compensation data (that are preferably associated with similar scopes where such data are available), or by comparing aggregated reference data for different jobs. The service provider 12 is configured to periodically re-compute the refined training factors, e.g., monthly.
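
The computation/adjustment cycle can be sketched for the multiplicative form of equation (2). The text does not specify how factors are adjusted, so the rule used here (scaling each factor by the average ratio of collected to estimated pay, one scope criterion at a time) is purely an assumption, as are the names train and estimate, the ratio tolerance, and the sample data. Every record is assumed to carry a value for each criterion.

```python
from math import prod
from statistics import mean

def estimate(base, factors, scope):
    # Equation (2): aggregated reference data times one factor per criterion.
    return base * prod(factors[(c, v)] for c, v in scope.items())

def train(records, tolerance=0.01, max_rounds=50):
    """records: list of (scope_dict, collected_pay) for one job category."""
    factors = {(c, v): 1.0 for scope, _ in records for c, v in scope.items()}
    criteria = sorted({c for c, _ in factors})
    base = 0.0
    for _ in range(max_rounds):
        # Derive reference data from collected data, then aggregate by averaging.
        refs = [pay / prod(factors[(c, v)] for c, v in scope.items())
                for scope, pay in records]
        base = mean(refs)
        # Compare estimates with collected data; stop when all ratios are close to 1.
        ratios = [estimate(base, factors, scope) / pay for scope, pay in records]
        if all(abs(r - 1.0) <= tolerance for r in ratios):
            break
        # Assumed adjustment rule, applied one criterion at a time.
        for c in criteria:
            for key in [k for k in factors if k[0] == c]:
                residuals = [pay / estimate(base, factors, scope)
                             for scope, pay in records if scope.get(c) == key[1]]
                factors[key] *= mean(residuals)
    return base, factors

# Made-up data for one job category.
records = [
    ({"region": "CA", "size": "small"}, 54000.0),
    ({"region": "CA", "size": "large"}, 66000.0),
    ({"region": "TX", "size": "small"}, 36000.0),
    ({"region": "TX", "size": "large"}, 44000.0),
]
base, factors = train(records)
```

The sample data were constructed to be exactly multiplicative, so the loop settles quickly. Real survey data would generally not fit exactly, and the human review and anomaly handling described above are not modeled.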

Compensation data can be estimated by the service provider 12 for the combinations of scope criteria, regardless of whether compensation data with the corresponding scope criteria were collected. The aggregated reference data and the refined training factors can be used by the service provider 12 in equation (1) to determine estimates for a complete set of compensation data, i.e., for every combination of training factors for each job category. For example, using base pay aggregated reference data for a teacher, and using equation (2), the service provider can compute the base pay for a teacher in a small school in California by multiplying the aggregated reference data by the California geographic refined training factor, and by the small organization size refined training factor, for teachers. Data may be extrapolated between job categories, e.g., with a principal's base pay being calculated from a teacher's base pay aggregated reference data by applying a teacher-to-principal training factor (e.g., using equation (2), by also multiplying by this factor). Also, training factors associated with one job may be determined by applying the job-to-job factor to data (e.g., training factors) associated with another job.
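
A complete set of estimates under equation (2) can be sketched by enumerating every combination of factor values. The function name estimate_all, the factor values, and the teacher-to-principal factor of 1.5 are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod

def estimate_all(base, factors_by_criterion):
    """factors_by_criterion: {criterion: {value: refined training factor}}."""
    criteria = list(factors_by_criterion)
    table = {}
    # One estimate per combination of scope values, whether or not data were collected.
    for combo in product(*(factors_by_criterion[c].items() for c in criteria)):
        values = tuple(value for value, _ in combo)
        table[values] = base * prod(factor for _, factor in combo)
    return table

teacher = estimate_all(
    50000.0,  # illustrative aggregated reference data for teacher base pay
    {
        "region": {"California": 1.2, "Texas": 0.8},
        "size": {"small": 0.9, "large": 1.1},
    },
)

# Job-to-job extrapolation with a hypothetical teacher-to-principal factor.
teacher_to_principal = 1.5
principal = {combo: pay * teacher_to_principal for combo, pay in teacher.items()}
```

Here teacher[("California", "small")] corresponds to the small California school example: the aggregated reference data multiplied by the California factor and the small-organization factor.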

Estimated data can be used by the service provider 12 to account for anomalies and/or biases that artificially affect the collected actual data. The accounting may include weighting the data as discussed below. Collected actual data, however, are preferably not changed and the service provider 12 continues to store the collected actual data as is. The provider 12 may store the estimated data and the collected data separately.

Data Interpolation

The service provider 12 is further configured to combine estimated and actual collected compensation data. Estimated data produced by the service provider 12 are internally consistent based on a self-produced set of training factors. Combining the estimated data with actual collected data may yield a “smoothed” data set that is a blend of fact and estimated data. The service provider 12 may weight the estimated data and the actual collected data and combine the weighted data to produce an aggregated data set. For example, equal weights may be applied to estimated and collected data. Weights may also be dependent on one or more factors such as source of the data, whether the data are collected or predicted, the number of companies or individuals represented by a value (e.g., is a datum an average of data from five sources or representing 20 persons, etc.), or other factors. The combination may take various forms such as an average of all data points, or by first averaging all collected data, then averaging an estimated data point and the average of the collected data. Weights could include a zero-weighting to eliminate data from the combination, e.g., to eliminate estimated data in the presence of collected data.
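
One of the combinations mentioned above (average the collected data first, then take a weighted average with the estimate) can be sketched as follows. The function name blend, the default weights, and the pay figures are assumptions for illustration.

```python
from statistics import mean

def blend(collected, estimated, w_collected=0.5, w_estimated=0.5):
    """Average collected points, then weight that average against the estimate."""
    if not collected:
        return estimated  # nothing collected for this scope: use the estimate
    total = w_collected + w_estimated
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return (w_collected * mean(collected) + w_estimated * estimated) / total

collected_points = [52000.0, 56000.0]  # made-up collected data for one scope
estimated_point = 50000.0              # made-up estimate for the same scope

equal_blend = blend(collected_points, estimated_point)
collected_only = blend(collected_points, estimated_point, 1.0, 0.0)  # zero-weight the estimate
```

Weights that depend on data source or on the number of persons represented could be passed in the same way; how such weights would be chosen is not addressed here.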

Data Presentation

The service provider 12 can produce and provide a compensation data report including estimated/predicted data and/or actual collected data. The service provider can arrange the compensation data into a desired report format and provide the report to the compensation-data seeker 16 via the network 18. The data seeker 16 includes a computer and a user (not shown) of the computer. The user may be, e.g., a company that wants to know what the current compensation practices are for certain jobs and certain scope criteria so that the company can remain competitive to attract and retain employees. The user may also be, e.g., an individual wanting to know what to expect for compensation for certain jobs meeting certain scope criteria (e.g., to help the individual with compensation expectations and requests). The user can choose (e.g., using a web-based tool if the network 18 is the Internet) to have compensation data provided by the service provider 12 in a variety of forms. For example, the user may choose to see a composite interpolated set of data, each individual collected value and/or estimated datum, the user's own interpolated aggregation of the collected data and/or estimates (e.g., the user can provide weights for combining collected and estimated data), or other forms of the collected and predicted data. For instances where multiple collected compensation data points exist for a particular set of scope criteria, the user may selectively weight the collected data points individually. The service provider 12 can provide a complete set of compensation data such that the user can see information for jobs with any combination of the training factors and thus any available combination of scopes of the jobs. Data presented can also include indicia of estimated and actual collected data supporting the displayed data. For example, the number of collected data points may be displayed, or indicia may be provided as to what portion of the presented data were estimated and what portion were collected.

Operation

In operation, referring to FIG. 3, with further reference to FIGS. 1-2, a process 50 for collecting and processing compensation data using the system 10 includes the stages shown. The process 50, however, is exemplary only and not limiting. The process 50 can be altered, e.g., by having stages added, removed, or rearranged. The process 50 is for the exemplary, not the only, situation in which the service provider 12 uses equation (2) for determining estimated compensation data, and averages reference data to determine the aggregated reference data of the determined reference data. As shown, the process 50 includes collecting compensation data, storing collected compensation data, training the stored data, estimating compensation data, interpolating the stored collected data and the trained data, and presenting the interpolated data.

At stage 52, the service provider 12 collects compensation data. Referring to FIG. 4, the data collection stage 52 includes the sub-stages shown. The sub-stages shown, however, are exemplary only and not limiting. Sub-stages can be added, removed, or rearranged.

At sub-stage 70, published or other collections or aggregated compensation data, and individual employer compensation data, are sent to and/or acquired by the service provider 12. The data may be sent, e.g., in response to inquiries, to the service provider 12, or may be acquired by the service provider 12, e.g., by soliciting responses or purchasing collections of data. The data may be sent to the provider 12 in various forms for manual and/or automatic entry, and may be for individuals, combinations of individuals, etc.

At sub-stage 72, the provider 12 loads the received/acquired data manually or automatically (e.g., with a punchcard reader, optical scanner, etc.). The loaded data are screened for internal consistency by automated procedures (e.g., a computer program) and/or with human review to analyze the data for illogical and/or nonsensical data. The service provider 12 converts the data into a standardized format, possibly adjusting data values to be stored to reduce and/or eliminate illogical and/or nonsensical data, or other data chosen to be adjusted. Stage 52 ends by proceeding to stage 54 shown in FIG. 3.

At stage 54, the service provider 12 stores the loaded, tested, converted compensation data. The data are stored in a data library database of the storage devices/media 34 of the computer 13. The data are stored in a manner to be accessible for further processing.

At stage 56, the service provider 12 performs data training on the stored data. Referring to FIG. 5, the data training stage 56 includes the sub-stages shown. The sub-stages shown, however, are exemplary only and not limiting. Sub-stages can be added, removed, or rearranged.

At sub-stage 80, the service provider 12 determines the initial training factors. Using automated procedures and/or human input, the initial training factors are set. A person that is experienced/knowledgeable regarding compensation practices may set the values, or influence values determined, e.g., by determining a ratio of average collected data and one or more data points having a particular scope criterion. Sets of training factors are initiated for each job category desired. Training factors are initialized for each of the scope criteria to derive initial training factors for the corresponding scope criteria and the corresponding job category.
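By way of illustration only, one possible reading of the ratio-based initialization of sub-stage 80 is sketched below in Python. The function name `initial_factors`, the representation of a data point as a (scope criteria, compensation) pair, and the choice to divide the average for each scope value by the overall average are assumptions made for this sketch; the description does not prescribe a particular initialization rule.

```python
from statistics import mean

def initial_factors(collected, criterion):
    """Seed one training factor per value of a scope criterion.

    Assumed rule: factor = (average compensation of the data points having
    that scope value) / (average compensation of all collected data points).
    """
    overall = mean([comp for _, comp in collected])
    values = sorted({scope[criterion] for scope, _ in collected})
    return {
        value: mean([comp for scope, comp in collected
                     if scope[criterion] == value]) / overall
        for value in values
    }

# Hypothetical collected data for a single job category.
collected = [
    ({"region": "northeast"}, 120000.0),
    ({"region": "midwest"}, 80000.0),
]
seed = initial_factors(collected, "region")
```

A person knowledgeable in compensation practices could then override or adjust any of the seeded values before training proceeds.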

At sub-stage 82, the training factors are used to determine the reference data of equation (1). The collected data points (adjusted as appropriate) for each job category are put into equation (1) along with the training factors, and the service provider 12 solves equation (1) for the reference data for the respective job categories. In this example, the equation (2) form of equation (1) is used, and thus the reference data = (collected data)/ΠTF_n.
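As a minimal sketch of sub-stage 82 under the equation (2) form, each collected data point may be divided by the product of the training factors that apply to it. The factor values, scope criteria names, and compensation figures below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import prod

# Hypothetical training factors for one job category, keyed by scope criterion.
training_factors = {
    "region": {"northeast": 1.10, "midwest": 0.95},
    "company_size": {"small": 0.90, "large": 1.15},
}

# Hypothetical collected data points: (scope criteria, compensation).
collected = [
    ({"region": "northeast", "company_size": "large"}, 126500.0),
    ({"region": "midwest", "company_size": "small"}, 85500.0),
]

def reference_datum(scope, compensation, factors):
    """Reference data = (collected data) / product of the applicable TF_n."""
    product = prod(factors[criterion][value] for criterion, value in scope.items())
    return compensation / product

# One reference datum per collected data point.
reference_data = [reference_datum(s, c, training_factors) for s, c in collected]
```

With these illustrative numbers both data points reduce to a reference datum of approximately 100,000, which indicates that the chosen factors account for the differences between the two scope combinations.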

At sub-stage 84, the reference data derived in sub-stage 82 are aggregated, and the aggregated reference data are used in equation (1) to determine estimated data. Multiple reference data are derived in sub-stage 82, preferably a datum for each collected data point. These reference data are aggregated, in this example averaged, to determine the aggregated reference data. Aggregated reference data for different job categories may be compared (e.g., a ratio taken) to determine a job-to-job extrapolation factor. The aggregated reference data and the training factors for each job are applied to equation (1), in this example taking the form of equation (2). Each combination of training factors for the respective job is used in equation (2) with the corresponding aggregated data to determine estimated compensation data for the respective job for each combination of scope criteria. Thus, even if actual data have not been collected for certain combinations of scope criteria, estimates can be obtained for these combinations using the aggregated reference data and the determined training factors, associated with that scope criteria combination, in equation (1).
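The aggregation and estimation of sub-stage 84 can be sketched as follows, assuming averaging as the aggregation and the equation (2) form, i.e., estimated data = (aggregated reference data) × ΠTF_n. The function name `estimate_all` and the sample values are hypothetical.

```python
from itertools import product
from math import prod
from statistics import mean

def estimate_all(reference_data, factors):
    """Average the reference data, then estimate every scope combination."""
    base = mean(reference_data)  # aggregated reference data
    criteria = sorted(factors)   # fixed ordering of the scope criteria
    estimates = {}
    for combo in product(*(sorted(factors[c]) for c in criteria)):
        multiplier = prod(factors[c][v] for c, v in zip(criteria, combo))
        estimates[combo] = base * multiplier
    return base, estimates

# Hypothetical inputs for one job category.
factors = {
    "region": {"northeast": 1.10, "midwest": 0.95},
    "company_size": {"small": 0.90, "large": 1.15},
}
base, estimates = estimate_all([98000.0, 102000.0], factors)
# Keys are (company_size, region) tuples because the criteria are sorted by name.
```

Because every combination of factor values is enumerated, an estimate is produced even for a scope combination, such as a small company in the northeast, for which no data point was collected.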

At sub-stage 86, the service provider 12 compares the predicted data with the collected compensation data. Collected and predicted compensation data having similar scope criteria and the same job category are compared. For example, a difference between, or a ratio of, etc., similar data may be determined.

At sub-stage 88, the provider 12 determines whether the predicted compensation data are acceptably close to the collected compensation data. In other words, the provider 12 determines whether the current training factors adequately predict actual compensation data. If the compared data are not within acceptable limits, e.g., differences between them (or ratios of them) are greater than a threshold value, then the stage 56 proceeds to sub-stage 90, and if they are within acceptable limits, e.g., differences are less than or equal to the threshold, then stage 56 proceeds to sub-stage 92.
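A minimal sketch of the comparison and acceptance test of sub-stages 86 and 88 follows. The use of a relative difference, the 5% default threshold, and the function name `within_limits` are assumptions for illustration; an absolute difference or a ratio could be used instead.

```python
def within_limits(collected, predicted, threshold=0.05):
    """Return True if every relative difference is at or below the threshold.

    `collected` and `predicted` map the same scope-criteria keys to
    compensation values for a single job category.
    """
    return all(
        abs(predicted[key] - actual) / actual <= threshold
        for key, actual in collected.items()
    )

# Hypothetical collected and predicted values for two scope combinations.
collected = {("northeast", "large"): 126500.0, ("midwest", "small"): 85500.0}
close = {("northeast", "large"): 124000.0, ("midwest", "small"): 87000.0}
far = {("northeast", "large"): 110000.0, ("midwest", "small"): 87000.0}

proceed_to_92 = within_limits(collected, close)    # factors are adequate
proceed_to_90 = not within_limits(collected, far)  # factors need adjusting
```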

At sub-stage 90, adjustments are made to the training factors. The adjustments can be made automatically, e.g., according to software instructions in the computer 13, and/or manually, e.g., under influence by a person with knowledge and/or experience in compensation practices. The adjustments are made to help the predicted data better correlate to the actual collected data. The stage 56 returns to sub-stage 82 and the adjusted training factors are used in determining new reference data, etc., to re-determine predicted compensation data. This loop of sub-stages 82, 84, 86, 88, and 90 repeats until the predicted and actual data meet one or more desired criteria.
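The loop of sub-stages 82, 84, 86, 88, and 90 may be sketched as below. The description does not specify how the training factors are adjusted automatically, so the damped multiplicative update used here (each factor is scaled by the mean ratio of collected to predicted data for the points it applies to, raised to a power `rate`) is purely an assumption, as are the function name `train`, the tolerance, and the sample data.

```python
from math import prod
from statistics import mean

def train(collected, factors, tolerance=0.01, rate=0.5, max_iterations=100):
    """Repeat sub-stages 82-90 until predictions are within tolerance."""
    factors = {c: dict(v) for c, v in factors.items()}  # work on a copy
    for _ in range(max_iterations):
        multipliers = [prod(factors[c][v] for c, v in scope.items())
                       for scope, _ in collected]
        # Sub-stage 82: reference data = collected data / product of factors.
        reference = [comp / m for (_, comp), m in zip(collected, multipliers)]
        # Sub-stage 84: aggregate (average) and estimate with equation (2).
        base = mean(reference)
        predicted = [base * m for m in multipliers]
        # Sub-stages 86 and 88: compare and test against the tolerance.
        errors = [abs(p - comp) / comp
                  for p, (_, comp) in zip(predicted, collected)]
        if max(errors) <= tolerance:
            return base, factors, True
        # Sub-stage 90: assumed adjustment rule (not taken from the description).
        for criterion, values in factors.items():
            for value in values:
                ratios = [comp / p
                          for (scope, comp), p in zip(collected, predicted)
                          if scope[criterion] == value]
                if ratios:
                    values[value] *= mean(ratios) ** rate
    return base, factors, False

# Hypothetical collected data and neutral starting factors.
collected = [
    ({"region": "northeast", "company_size": "small"}, 108000.0),
    ({"region": "northeast", "company_size": "large"}, 132000.0),
    ({"region": "midwest", "company_size": "small"}, 72000.0),
    ({"region": "midwest", "company_size": "large"}, 88000.0),
]
initial = {
    "region": {"northeast": 1.0, "midwest": 1.0},
    "company_size": {"small": 1.0, "large": 1.0},
}
base, trained, converged = train(collected, initial)
```

The damping exponent keeps two criteria from both applying the full correction for the same discrepancy in a single pass; a manual adjustment by a knowledgeable person could replace or supplement the automatic step.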

At sub-stage 92, the training factors are stored and any remaining estimated data are predicted. Preferably, at sub-stage 84, compensation data are estimated for all combinations of training factors. It is possible in sub-stage 84 above, however, to calculate estimated compensation data for fewer than all possible combinations of training factors (e.g., only those combinations for which actual collected data exist), leaving some combinations of data potentially unpredicted. In this case, at sub-stage 92, the remaining compensation data estimates are made. The stage 56 ends by proceeding to stage 58 shown in FIG. 3.

At stage 58, estimated compensation data are stored. The service provider 12 stores the compensation data estimated at stage 56 in the data library database of the computer 13 for further processing and/or display.

At stage 60, the service provider 12 performs data interpolation on the collected and estimated data. The service provider combines collected compensation data stored at stage 54 and estimated compensation data stored at stage 58 that are for the same jobs and that have the same set of scope criteria (as indicated by the training factors associated with each piece of estimated data). The data may be combined in a variety of fashions, e.g., by averaging them, by averaging all compensation data and then averaging the averaged compensation data and the estimated datum, by determining a weighted average of the data, etc. The weighting may even eliminate the actual or the estimated data (i.e., if a weighting is set to zero).
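One of the combinations described for stage 60, averaging the collected data and then taking a weighted average of that result with the estimated datum, can be sketched as follows. The function name `combine`, the default weight of 0.5, and the fallback to the estimate when no data were collected are assumptions for this sketch.

```python
from statistics import mean

def combine(collected_values, estimate, estimate_weight=0.5):
    """Blend the average of collected data with the estimated datum.

    estimate_weight=0 uses only collected data; estimate_weight=1 uses only
    the estimate. With no collected data, the estimate is returned (assumed).
    """
    if not collected_values:
        return estimate
    collected_average = mean(collected_values)
    return (1 - estimate_weight) * collected_average + estimate_weight * estimate

# Hypothetical data for one job and one set of scope criteria.
blended = combine([90000.0, 110000.0], 104000.0)
collected_only = combine([90000.0, 110000.0], 104000.0, estimate_weight=0)
estimate_only = combine([], 104000.0)
```

Setting the weight to zero or one reproduces the case in which the weighting eliminates the estimated or the actual data, respectively.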

At stage 62, the combined actual and estimated compensation data are presented to the compensation-data seeker 16. The data may be transmitted to the seeker 16 electronically via the network 18. The data are sent by the service provider 12 in a format to help the data seeker user easily understand the data. The user can select to view the data in a variety of forms, e.g., by entering desired information into the computer of the data seeker 16. For example, the user may choose to view only collected data (either an average or individual values), only estimated data, or a combination of these (that may be weighted according to a user's desired weighting), etc. The user may also choose to individually weight actual data points where actual data points were collected. The service provider 12 preferably provides information as to how many actual collected data points were used in data provided to the data seeker 16, and how much of the data provided was estimated versus collected.

Other Embodiments

Other embodiments are within the scope and spirit of the appended claims. For example, due to the nature of software, functions described above can be implemented using software, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations. Also, the job category can be treated as a training factor instead of producing sets of training factors for each job category. The job category would serve as a training factor, and other training factors could be dependent upon the job factor. Also, at sub-stages 86 and 88, instead of comparing estimated and actual compensation data and determining whether they adequately agree, the reference data and the aggregated reference data may be compared and adequacy of agreement checked. The training factors could be adjusted based on correlation between the reference data and the aggregated reference data, or still between actual and estimated compensation data.

1-4. (canceled)
5. A method of predicting compensation data, comprising: determining, for each of a plurality of characteristics associated with a first job category, a training factor and a value associated with the training factor; determining a base value for the job category; identifying a function satisfying at least one criterion, the function uses the base value, and the values of the training factors to estimate compensation data; and automatically generating, using a computer hardware system, estimated compensation data using the function, the base value, and at least a portion of the values of the training factors.
6. The method of claim 5, wherein a value of a first training factor is dependent upon a value of a second training factor.
7. The method of claim 5, wherein a value of a first training factor is dependent upon both a value of a second training factor and a value of a third training factor.
8. The method of claim 5, further comprising: determining, for each of a plurality of characteristics associated with a second job category, a training factor and a value associated with the training factor.
9. The method of claim 8, further comprising: determining a training factor between the first job category and the second job category and a value associated with the training factor.
10. The method of claim 5, further comprising: generating reference data using the estimated compensation data and the values of factors, wherein the base value is determined using an aggregation of the reference data.
11. The method of claim 5, further comprising: comparing the estimated compensation data with collected data; and adjusting, based upon the comparison, at least a portion of the values of the training factors.
12. The method of claim 11, further comprising: repeating the automatically generating the estimated compensation data using the portion of the values of the training factors having been adjusted.
13. A computer hardware system configured to predict compensation data, comprising: at least one processor, wherein the at least one processor is configured to initiate and/or perform: determining, for each of a plurality of characteristics associated with a first job category, a training factor and a value associated with the training factor; determining a base value for the job category; identifying a function satisfying at least one criterion, the function uses the base value, and the values of the training factors to estimate compensation data; and automatically generating estimated compensation data using the function, the base value, and at least a portion of the values of the training factors.
14. The system of claim 13, wherein a value of a first training factor is dependent upon a value of a second training factor.
15. The system of claim 13, wherein a value of a first training factor is dependent upon both a value of a second training factor and a value of a third training factor.
16. The system of claim 13, wherein the at least one processor is further configured to initiate and/or perform: determining, for each of a plurality of characteristics associated with a second job category, a training factor and a value associated with the training factor.
17. The system of claim 16, wherein the at least one processor is further configured to initiate and/or perform: determining a training factor between the first job category and the second job category and a value associated with the training factor.
18. The system of claim 13, wherein the at least one processor is further configured to initiate and/or perform: generating reference data using the estimated compensation data and the values of factors, wherein the base value is determined using an aggregation of the reference data.
19. The system of claim 13, wherein the at least one processor is further configured to initiate and/or perform: comparing the estimated compensation data with collected data; and adjusting, based upon the comparison, at least a portion of the values of the training factors.
20. The system of claim 19, wherein the at least one processor is further configured to initiate and/or perform: repeating the automatically generating the estimated compensation data using the portion of the values of the training factors having been adjusted.
21. A method of predicting compensation data, comprising: obtaining compensation data, associated with a job category, with at least one datum being associated with each of a plurality of characteristics associated with the job category; determining values of factors, associated with respective ones of the characteristics, and a base value that when used as operands of a function yield estimates of the obtained data such that relationships between the estimates and corresponding obtained compensation data satisfy at least one criterion; and automatically obtaining, using a computer, estimates of the compensation data using a portion of the values of factors and the base value.
22. The method of claim 21, wherein the estimates of compensation data are obtained using each combination of values of factors for which values are determined.
23. The method of claim 21, further comprising: deriving reference data using the obtained compensation data and the values of factors; and aggregating the reference data to determine the base value.
24. The method of claim 23 wherein the aggregating includes averaging the reference data.